{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Split Learning—Bank Marketing"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    ">The following codes are demos only. It's **NOT for production** due to system security concerns, please **DO NOT** use it directly in production."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this tutorial, we will use the bank's marketing model as an example to show how to accomplish split learning in vertical scenarios under the `SecretFlow` framework.\n",
    "`SecretFlow` provides a user-friendly Api that makes it easy to apply your Keras model or PyTorch model to split learning scenarios to complete joint modeling tasks for vertical scenarios.\n",
    "\n",
    "In this tutorial we will show you how to turn your existing 'Keras' model into a split learning model under `Secretflow` to complete federated multi-party modeling tasks."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## What is Split Learning？"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The core idea of split learning is split the network structure. Each device (silo) retains only a part of the network structure, and the sub-network structure of all devices is combined together to form a complete network model. \n",
    "In the training process, different devices (silo) only perform forward or reverse calculation on the local network structure, and transfer the calculation results to the next device. Multiple devices complete the training through joint model until convergence.\n",
    "\n",
    " <img alt=\"split_learning_tutorial.png\" src=\"resources/split_learning_tutorial.png\" width=\"600\">  \n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Alice**：have *data\\_alice*，*model\\_base\\_alice*  \n",
    "**Bob**: have *data\\_bob*，*model\\_base\\_bob*，*model\\_fuse*  \n",
    "\n",
    "1. **Alice** uses its data to get *hidden0* through *model\\_base\\_Alice* and send it to Bob. \n",
    "2. **Bob** gets *hidden1* with its data through *model\\_base\\_bob*.\n",
    "3. *hidden\\_0* and *hidden\\_1* are input to the *AggLayer* for aggregation, and the aggregated *hidden\\_merge* is the output.\n",
    "4. **Bob** input *hidden\\_merge* to *model\\_fuse*, get the gradient with *label* and send it back.\n",
    "5. The gradient is split into two parts *g\\_0*, *g\\_1* through *AggLayer*, which are sent to **Alice** and **Bob** respectively.\n",
    "6. Then **Alice** and **Bob** update their local base net with *g\\_0* or *g\\_1*.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Task"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Marketing is the banking industry in the ever-changing market environment, to meet the needs of customers, to achieve business objectives of the overall operation and sales activities. In the current environment of big data, data analysis provides a more effective analysis means for the banking industry. Customer demand analysis, understanding of target market trends and more macro market strategies can provide the basis and direction.  \n",
    "  \n",
    "The data from [kaggle](https://www.kaggle.com/janiobachmann/bank-marketing-dataset) is a set of classic marketing data bank, is a Portuguese bank agency telephone direct marketing activities, The target variable is whether the customer subscribes to deposit product."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Data\n",
    "\n",
    "1. The total sample size was 11162, including 8929 training set and 2233 test set\n",
    "2. Feature dim is 16, target is binary classification\n",
    "3. We have cut the data in advance. Alice holds the 4-dimensional basic attribute features, Bob holds the 12-dimensional bank transaction features, and only Alice holds the corresponding label"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's start by looking at what our bank's marketing data look like?  \n",
    "\n",
    "The original data is divided into Bank Alice and Bank Bob, which stores in Alice and Bob respectively. Here, CSV is the original data that has only been separated without pre-processing, we will use `secretflow preprocess` for FedData preprocess"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "2022-09-07 15:08:41.518030: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/rh/devtoolset-10/root/usr/lib64:/opt/rh/devtoolset-10/root/usr/lib:/opt/rh/devtoolset-10/root/usr/lib64/dyninst:/opt/rh/devtoolset-10/root/usr/lib/dyninst:/opt/rh/devtoolset-10/root/usr/lib64:/opt/rh/devtoolset-10/root/usr/lib\n"
     ]
    }
   ],
   "source": [
    "%load_ext autoreload\n",
    "%autoreload 2\n",
    "\n",
    "import secretflow as sf\n",
    "\n",
    "sf.init(['alice', 'bob'], num_cpus=8, log_to_driver=True)\n",
    "alice, bob = sf.PYU('alice'), sf.PYU('bob')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### prepare data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "from secretflow.utils.simulation.datasets import dataset\n",
    "\n",
    "df = pd.read_csv(dataset('bank_marketing'), sep=';')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We assume that Alice is a new bank, and they only have the basic information of the user and purchased the label of financial products from other bank."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>age</th>\n",
       "      <th>job</th>\n",
       "      <th>marital</th>\n",
       "      <th>education</th>\n",
       "      <th>y</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>30</td>\n",
       "      <td>unemployed</td>\n",
       "      <td>married</td>\n",
       "      <td>primary</td>\n",
       "      <td>no</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>33</td>\n",
       "      <td>services</td>\n",
       "      <td>married</td>\n",
       "      <td>secondary</td>\n",
       "      <td>no</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>35</td>\n",
       "      <td>management</td>\n",
       "      <td>single</td>\n",
       "      <td>tertiary</td>\n",
       "      <td>no</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>30</td>\n",
       "      <td>management</td>\n",
       "      <td>married</td>\n",
       "      <td>tertiary</td>\n",
       "      <td>no</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>59</td>\n",
       "      <td>blue-collar</td>\n",
       "      <td>married</td>\n",
       "      <td>secondary</td>\n",
       "      <td>no</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4516</th>\n",
       "      <td>33</td>\n",
       "      <td>services</td>\n",
       "      <td>married</td>\n",
       "      <td>secondary</td>\n",
       "      <td>no</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4517</th>\n",
       "      <td>57</td>\n",
       "      <td>self-employed</td>\n",
       "      <td>married</td>\n",
       "      <td>tertiary</td>\n",
       "      <td>no</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4518</th>\n",
       "      <td>57</td>\n",
       "      <td>technician</td>\n",
       "      <td>married</td>\n",
       "      <td>secondary</td>\n",
       "      <td>no</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4519</th>\n",
       "      <td>28</td>\n",
       "      <td>blue-collar</td>\n",
       "      <td>married</td>\n",
       "      <td>secondary</td>\n",
       "      <td>no</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4520</th>\n",
       "      <td>44</td>\n",
       "      <td>entrepreneur</td>\n",
       "      <td>single</td>\n",
       "      <td>tertiary</td>\n",
       "      <td>no</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>4521 rows × 5 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "      age            job  marital  education   y\n",
       "0      30     unemployed  married    primary  no\n",
       "1      33       services  married  secondary  no\n",
       "2      35     management   single   tertiary  no\n",
       "3      30     management  married   tertiary  no\n",
       "4      59    blue-collar  married  secondary  no\n",
       "...   ...            ...      ...        ...  ..\n",
       "4516   33       services  married  secondary  no\n",
       "4517   57  self-employed  married   tertiary  no\n",
       "4518   57     technician  married  secondary  no\n",
       "4519   28    blue-collar  married  secondary  no\n",
       "4520   44   entrepreneur   single   tertiary  no\n",
       "\n",
       "[4521 rows x 5 columns]"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "alice_data = df[[\"age\", \"job\", \"marital\", \"education\", \"y\"]]\n",
    "alice_data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Bob is an old bank, they have the user's account balance, house, loan, and recent marketing feedback"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>default</th>\n",
       "      <th>balance</th>\n",
       "      <th>housing</th>\n",
       "      <th>loan</th>\n",
       "      <th>contact</th>\n",
       "      <th>day</th>\n",
       "      <th>month</th>\n",
       "      <th>duration</th>\n",
       "      <th>campaign</th>\n",
       "      <th>pdays</th>\n",
       "      <th>previous</th>\n",
       "      <th>poutcome</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>no</td>\n",
       "      <td>1787</td>\n",
       "      <td>no</td>\n",
       "      <td>no</td>\n",
       "      <td>cellular</td>\n",
       "      <td>19</td>\n",
       "      <td>oct</td>\n",
       "      <td>79</td>\n",
       "      <td>1</td>\n",
       "      <td>-1</td>\n",
       "      <td>0</td>\n",
       "      <td>unknown</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>no</td>\n",
       "      <td>4789</td>\n",
       "      <td>yes</td>\n",
       "      <td>yes</td>\n",
       "      <td>cellular</td>\n",
       "      <td>11</td>\n",
       "      <td>may</td>\n",
       "      <td>220</td>\n",
       "      <td>1</td>\n",
       "      <td>339</td>\n",
       "      <td>4</td>\n",
       "      <td>failure</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>no</td>\n",
       "      <td>1350</td>\n",
       "      <td>yes</td>\n",
       "      <td>no</td>\n",
       "      <td>cellular</td>\n",
       "      <td>16</td>\n",
       "      <td>apr</td>\n",
       "      <td>185</td>\n",
       "      <td>1</td>\n",
       "      <td>330</td>\n",
       "      <td>1</td>\n",
       "      <td>failure</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>no</td>\n",
       "      <td>1476</td>\n",
       "      <td>yes</td>\n",
       "      <td>yes</td>\n",
       "      <td>unknown</td>\n",
       "      <td>3</td>\n",
       "      <td>jun</td>\n",
       "      <td>199</td>\n",
       "      <td>4</td>\n",
       "      <td>-1</td>\n",
       "      <td>0</td>\n",
       "      <td>unknown</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>no</td>\n",
       "      <td>0</td>\n",
       "      <td>yes</td>\n",
       "      <td>no</td>\n",
       "      <td>unknown</td>\n",
       "      <td>5</td>\n",
       "      <td>may</td>\n",
       "      <td>226</td>\n",
       "      <td>1</td>\n",
       "      <td>-1</td>\n",
       "      <td>0</td>\n",
       "      <td>unknown</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4516</th>\n",
       "      <td>no</td>\n",
       "      <td>-333</td>\n",
       "      <td>yes</td>\n",
       "      <td>no</td>\n",
       "      <td>cellular</td>\n",
       "      <td>30</td>\n",
       "      <td>jul</td>\n",
       "      <td>329</td>\n",
       "      <td>5</td>\n",
       "      <td>-1</td>\n",
       "      <td>0</td>\n",
       "      <td>unknown</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4517</th>\n",
       "      <td>yes</td>\n",
       "      <td>-3313</td>\n",
       "      <td>yes</td>\n",
       "      <td>yes</td>\n",
       "      <td>unknown</td>\n",
       "      <td>9</td>\n",
       "      <td>may</td>\n",
       "      <td>153</td>\n",
       "      <td>1</td>\n",
       "      <td>-1</td>\n",
       "      <td>0</td>\n",
       "      <td>unknown</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4518</th>\n",
       "      <td>no</td>\n",
       "      <td>295</td>\n",
       "      <td>no</td>\n",
       "      <td>no</td>\n",
       "      <td>cellular</td>\n",
       "      <td>19</td>\n",
       "      <td>aug</td>\n",
       "      <td>151</td>\n",
       "      <td>11</td>\n",
       "      <td>-1</td>\n",
       "      <td>0</td>\n",
       "      <td>unknown</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4519</th>\n",
       "      <td>no</td>\n",
       "      <td>1137</td>\n",
       "      <td>no</td>\n",
       "      <td>no</td>\n",
       "      <td>cellular</td>\n",
       "      <td>6</td>\n",
       "      <td>feb</td>\n",
       "      <td>129</td>\n",
       "      <td>4</td>\n",
       "      <td>211</td>\n",
       "      <td>3</td>\n",
       "      <td>other</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4520</th>\n",
       "      <td>no</td>\n",
       "      <td>1136</td>\n",
       "      <td>yes</td>\n",
       "      <td>yes</td>\n",
       "      <td>cellular</td>\n",
       "      <td>3</td>\n",
       "      <td>apr</td>\n",
       "      <td>345</td>\n",
       "      <td>2</td>\n",
       "      <td>249</td>\n",
       "      <td>7</td>\n",
       "      <td>other</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>4521 rows × 12 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "     default  balance housing loan   contact  day month  duration  campaign  \\\n",
       "0         no     1787      no   no  cellular   19   oct        79         1   \n",
       "1         no     4789     yes  yes  cellular   11   may       220         1   \n",
       "2         no     1350     yes   no  cellular   16   apr       185         1   \n",
       "3         no     1476     yes  yes   unknown    3   jun       199         4   \n",
       "4         no        0     yes   no   unknown    5   may       226         1   \n",
       "...      ...      ...     ...  ...       ...  ...   ...       ...       ...   \n",
       "4516      no     -333     yes   no  cellular   30   jul       329         5   \n",
       "4517     yes    -3313     yes  yes   unknown    9   may       153         1   \n",
       "4518      no      295      no   no  cellular   19   aug       151        11   \n",
       "4519      no     1137      no   no  cellular    6   feb       129         4   \n",
       "4520      no     1136     yes  yes  cellular    3   apr       345         2   \n",
       "\n",
       "      pdays  previous poutcome  \n",
       "0        -1         0  unknown  \n",
       "1       339         4  failure  \n",
       "2       330         1  failure  \n",
       "3        -1         0  unknown  \n",
       "4        -1         0  unknown  \n",
       "...     ...       ...      ...  \n",
       "4516     -1         0  unknown  \n",
       "4517     -1         0  unknown  \n",
       "4518     -1         0  unknown  \n",
       "4519    211         3    other  \n",
       "4520    249         7    other  \n",
       "\n",
       "[4521 rows x 12 columns]"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "bob_data = df[[\"default\", \"balance\", \"housing\", \"loan\", \"contact\", \n",
    "             \"day\",\"month\",\"duration\",\"campaign\",\"pdays\",\"previous\",\"poutcome\"]]\n",
    "bob_data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create Secretflow Environment"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Create 2 entities in the Secretflow environment [Alice, Bob]\n",
    "Where 'Alice' and 'Bob' are two PYU\n",
    "Once you've constructed the two objects, you can happily start Splitting Learning"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Import Dependency"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "from secretflow.data.split import train_test_split\n",
    "from secretflow.ml.nn import SLModel"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Prepare Data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Build Federated Table**\n",
    "\n",
    "\n",
    "Federated table is a virtual concept that cross multiple parties, We define `VDataFrame` for vertical setting\n",
    "\n",
    "1. The data of all parties in a federated table is stored locally and is not allowed to go out of the domain.\n",
    "\n",
    "2. No one has access to data store except the party that owns the data.\n",
    "\n",
    "3. Any operation of the federated table will be scheduled by the driver to each worker, and the execution instructions will be delivered layer by layer until the Python Runtime of the specific worker. The framework ensures that only `worker.device` and `Object`. device can operate data at the same time.\n",
    "\n",
    "4. Federated tables are designed to management and manipulation multi-party data from a central perspective.\n",
    "\n",
    "5. Interfaces to `Federated Table` are aligned to `pandas.DataFrame` to reduce the cost of multi-party data operations.\n",
    "\n",
    "6. The SecretFlow framework provides Plain&Ciphertext hybrid programming capabilities. Vertical federated tables are built using `SPU`, and `MPC-PSI` is used to safely get intersection and align data from all parties.\n",
    "\n",
    "<img alt=\"vdataframe.png\" src=\"resources/vdataframe.png\" width=\"600\">  \n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "VDataFrame provides `read_csv` interface similar to pandas, except that `secretflow.read_csv` receives a dictionary that defines the path of data for both parties. We can use `secretflow.vertical.read_csv` to build the `VDataFrame`.\n",
    "```\n",
    "read_csv(file_dict,delimiter,ppu,keys,drop_key)\n",
    "    filepath: Path of the participant file. The address can be a relative or absolute path to a local file\n",
    "    ppu: PPU Device for PSI; If this parameter is not specified, data must be prealigned\n",
    "    keys: Key for intersection\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Create spu object"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "spu = sf.SPU(sf.utils.testing.cluster_def(['alice', 'bob']))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "from secretflow.utils.simulation.datasets import load_bank_marketing\n",
    "\n",
    "# Alice has the first four features,\n",
    "# while bob has the left features\n",
    "data = load_bank_marketing(parts={alice: (0, 4), bob: (4, 16)}, axis=1)\n",
    "# Alice holds the label.\n",
    "label = load_bank_marketing(parts={alice: (16, 17)}, axis=1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`data` is a vertically federated table. It only has the `Schema` of all the data globally"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's take a closer look at VDF data management \n",
    "\n",
    "As can be seen from an example, the `age` field belongs to Alice, so the corresponding column can be obtained in the partition of Alice, but Bob will report `KeyError` error when trying to obtain age.  \n",
    "There is a concept of `Partition`, which is a data fragment defined by us. Each Partition has its own device to which it belongs, and only the device that belongs can operate data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\u001b[2m\u001b[36m(pid=124081)\u001b[0m 2022-09-07 15:09:03.977270: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/rh/devtoolset-10/root/usr/lib64:/opt/rh/devtoolset-10/root/usr/lib:/opt/rh/devtoolset-10/root/usr/lib64/dyninst:/opt/rh/devtoolset-10/root/usr/lib/dyninst:/opt/rh/devtoolset-10/root/usr/lib64:/opt/rh/devtoolset-10/root/usr/lib\n",
      "\u001b[2m\u001b[36m(pid=124080)\u001b[0m 2022-09-07 15:09:04.001087: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/rh/devtoolset-10/root/usr/lib64:/opt/rh/devtoolset-10/root/usr/lib:/opt/rh/devtoolset-10/root/usr/lib64/dyninst:/opt/rh/devtoolset-10/root/usr/lib/dyninst:/opt/rh/devtoolset-10/root/usr/lib64:/opt/rh/devtoolset-10/root/usr/lib\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<secretflow.device.device.pyu.PYUObject object at 0x7f4b61e8c5e0>\n"
     ]
    },
    {
     "ename": "KeyError",
     "evalue": "<secretflow.device.device.pyu.PYU object at 0x7f4bb43235e0>",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mKeyError\u001b[0m                                  Traceback (most recent call last)",
      "Input \u001b[0;32mIn [8]\u001b[0m, in \u001b[0;36m<cell line: 2>\u001b[0;34m()\u001b[0m\n\u001b[1;32m      1\u001b[0m \u001b[38;5;28mprint\u001b[39m(data[\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mage\u001b[39m\u001b[38;5;124m'\u001b[39m]\u001b[38;5;241m.\u001b[39mpartitions[alice]\u001b[38;5;241m.\u001b[39mdata)\n\u001b[0;32m----> 2\u001b[0m \u001b[38;5;28mprint\u001b[39m(\u001b[43mdata\u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;124;43m'\u001b[39;49m\u001b[38;5;124;43mage\u001b[39;49m\u001b[38;5;124;43m'\u001b[39;49m\u001b[43m]\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mpartitions\u001b[49m\u001b[43m[\u001b[49m\u001b[43mbob\u001b[49m\u001b[43m]\u001b[49m)\n",
      "\u001b[0;31mKeyError\u001b[0m: <secretflow.device.device.pyu.PYU object at 0x7f4bb43235e0>"
     ]
    }
   ],
   "source": [
    "print(data['age'].partitions[alice].data)\n",
    "print(data['age'].partitions[bob])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We then do data preprocessing on the `VDataFrame`.。  \n",
    "Here we take `LabelEncoder` and `MinMaxScaler` as examples. These two preprocessor functions have corresponding concepts in `SkLearn` and their use methods are similar to those in `SkLearn`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\u001b[2m\u001b[36m(SPURuntime pid=124081)\u001b[0m I0907 15:09:05.583189 124081 external/com_github_brpc_brpc/src/brpc/server.cpp:1066] Server[yasl::link::internal::ReceiverServiceImpl] is serving on port=12485.\n",
      "\u001b[2m\u001b[36m(SPURuntime pid=124081)\u001b[0m I0907 15:09:05.583254 124081 external/com_github_brpc_brpc/src/brpc/server.cpp:1069] Check out http://i85c08157.eu95sqa:12485 in web browser.\n",
      "\u001b[2m\u001b[36m(SPURuntime pid=124080)\u001b[0m I0907 15:09:05.607770 124080 external/com_github_brpc_brpc/src/brpc/server.cpp:1066] Server[yasl::link::internal::ReceiverServiceImpl] is serving on port=57321.\n",
      "\u001b[2m\u001b[36m(SPURuntime pid=124080)\u001b[0m I0907 15:09:05.607844 124080 external/com_github_brpc_brpc/src/brpc/server.cpp:1069] Check out http://i85c08157.eu95sqa:57321 in web browser.\n",
      "\u001b[2m\u001b[36m(SPURuntime pid=124081)\u001b[0m I0907 15:09:05.683906 124435 external/com_github_brpc_brpc/src/brpc/socket.cpp:2202] Checking Socket{id=0 addr=127.0.0.1:57321} (0x564890479a40)\n",
      "\u001b[2m\u001b[36m(SPURuntime pid=124081)\u001b[0m I0907 15:09:05.684046 124435 external/com_github_brpc_brpc/src/brpc/socket.cpp:2262] Revived Socket{id=0 addr=127.0.0.1:57321} (0x564890479a40) (Connectable)\n"
     ]
    }
   ],
   "source": [
    "from secretflow.preprocessing.scaler import MinMaxScaler\n",
    "from secretflow.preprocessing.encoder import LabelEncoder"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "encoder = LabelEncoder()\n",
    "data['job'] = encoder.fit_transform(data['job'])\n",
    "data['marital'] = encoder.fit_transform(data['marital'])\n",
    "data['education'] = encoder.fit_transform(data['education'])\n",
    "data['default'] = encoder.fit_transform(data['default'])\n",
    "data['housing'] = encoder.fit_transform(data['housing'])\n",
    "data['loan'] = encoder.fit_transform(data['loan'])\n",
    "data['contact'] = encoder.fit_transform(data['contact'])\n",
    "data['poutcome'] = encoder.fit_transform(data['poutcome'])\n",
    "data['month'] = encoder.fit_transform(data['month'])\n",
    "label = encoder.fit_transform(label)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "label= <class 'secretflow.data.vertical.dataframe.VDataFrame'>,\n",
      "data = <class 'secretflow.data.vertical.dataframe.VDataFrame'>\n"
     ]
    }
   ],
   "source": [
    "print(f\"label= {type(label)},\\ndata = {type(data)}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Standardize data via MinMaxScaler"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\u001b[2m\u001b[36m(_run pid=119631)\u001b[0m /home/xingmeng.zhxm/anaconda3/envs/secretflow/lib/python3.8/site-packages/sklearn/base.py:443: UserWarning: X has feature names, but MinMaxScaler was fitted without feature names\n",
      "\u001b[2m\u001b[36m(_run pid=119631)\u001b[0m   warnings.warn(\n",
      "\u001b[2m\u001b[36m(_run pid=119633)\u001b[0m /home/xingmeng.zhxm/anaconda3/envs/secretflow/lib/python3.8/site-packages/sklearn/base.py:443: UserWarning: X has feature names, but MinMaxScaler was fitted without feature names\n",
      "\u001b[2m\u001b[36m(_run pid=119633)\u001b[0m   warnings.warn(\n"
     ]
    }
   ],
   "source": [
    "scaler = MinMaxScaler()\n",
    "\n",
    "data = scaler.fit_transform(data)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next we divide the data set into train-set and test-set"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "from secretflow.data.split import train_test_split\n",
    "random_state = 1234\n",
    "train_data,test_data = train_test_split(data, train_size=0.8, random_state=random_state)\n",
    "train_label,test_label = train_test_split(label, train_size=0.8, random_state=random_state)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Summary:** At this point, we have completed the definition of **federated tables**, **data preprocessing**, and **training set and test set partitioning**\n",
    "The secretFlow framework defines a set of operations to be built on the federated table (its logical counterpart is `pandas.DataFrame`). The secretflow framework defines a set of operations to be built on the federated table (its logical counterpart is `sklearn`) Refer to our documentation and API introduction to learn more about other features"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Introduce Model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**local version**: \n",
    "For this task, a basic DNN can be completed, input 16-dimensional features, through a DNN network, output the probability of positive and negative samples.\n",
    "\n",
    "\n",
    "**Federate version**：\n",
    "* Alice：\n",
    "    - base_net: Input 4-dimensional feature and go through a DNN network to get hidden\n",
    "    - fuse_net: Receive hidden features calculated by Alice and Bob, input them to FUSENET for feature fusion, and complete the forward process and backward process\n",
    "* Bob：\n",
    "    - base_net: Input 12-dimensional features, get hidden through a DNN network, and then send hidden to Alice to complete the following operation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Define Model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next we start creating the federated model \n",
    "we define SLTFModel and SLTorchModel(WIP), which are used to build split learning of vertical scene. We define a simple and easy to use extensible interface, which can easily transform your existing Model into SF-Model, and then conduct vertical scene federation modeling"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Split learning is to break up a model so that one part is executed locally on the data and the other part is executed on the label side.\n",
    "First let's define the locally executed model -- base_model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "def create_base_model(input_dim, output_dim,  name='base_model'):\n",
    "    # Create model\n",
    "    def create_model():\n",
    "        from tensorflow import keras\n",
    "        from tensorflow.keras import layers\n",
    "        import tensorflow as tf\n",
    "        model = keras.Sequential(\n",
    "            [\n",
    "                keras.Input(shape=input_dim),\n",
    "                layers.Dense(100,activation =\"relu\" ),\n",
    "                layers.Dense(output_dim, activation=\"relu\"),\n",
    "            ]\n",
    "        )\n",
    "        # Compile model\n",
    "        model.summary()\n",
    "        model.compile(loss='binary_crossentropy',\n",
    "                      optimizer='adam',\n",
    "                      metrics=[\"accuracy\",tf.keras.metrics.AUC()])\n",
    "        return model\n",
    "    return create_model\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We use create_base_model to create their base models for 'Alice' and 'Bob', respectively"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "# prepare model\n",
    "hidden_size = 64\n",
    "\n",
    "model_base_alice = create_base_model(4, hidden_size)\n",
    "model_base_bob = create_base_model(12, hidden_size)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "2022-09-07 15:09:22.473998: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/rh/devtoolset-10/root/usr/lib64:/opt/rh/devtoolset-10/root/usr/lib:/opt/rh/devtoolset-10/root/usr/lib64/dyninst:/opt/rh/devtoolset-10/root/usr/lib/dyninst:/opt/rh/devtoolset-10/root/usr/lib64:/opt/rh/devtoolset-10/root/usr/lib\n",
      "2022-09-07 15:09:22.474030: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Model: \"sequential\"\n",
      "_________________________________________________________________\n",
      " Layer (type)                Output Shape              Param #   \n",
      "=================================================================\n",
      " dense (Dense)               (None, 100)               500       \n",
      "                                                                 \n",
      " dense_1 (Dense)             (None, 64)                6464      \n",
      "                                                                 \n",
      "=================================================================\n",
      "Total params: 6,964\n",
      "Trainable params: 6,964\n",
      "Non-trainable params: 0\n",
      "_________________________________________________________________\n",
      "Model: \"sequential_1\"\n",
      "_________________________________________________________________\n",
      " Layer (type)                Output Shape              Param #   \n",
      "=================================================================\n",
      " dense_2 (Dense)             (None, 100)               1300      \n",
      "                                                                 \n",
      " dense_3 (Dense)             (None, 64)                6464      \n",
      "                                                                 \n",
      "=================================================================\n",
      "Total params: 7,764\n",
      "Trainable params: 7,764\n",
      "Non-trainable params: 0\n",
      "_________________________________________________________________\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "<keras.engine.sequential.Sequential at 0x7f4b61e31100>"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "model_base_alice()\n",
    "model_base_bob()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next we define the side with the label, or the server-side model -- fuse_model\n",
    "In the definition of fuse_model, we need to correctly define `loss`, `optimizer`, and `metrics`. This is compatible with all configurations of your existing Keras model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [],
   "source": [
    "def create_fuse_model(input_dim, output_dim, party_nums, name='fuse_model'):\n",
    "    def create_model():\n",
    "        from tensorflow import keras\n",
    "        from tensorflow.keras import layers\n",
    "        import tensorflow as tf\n",
    "        # input\n",
    "        input_layers = []\n",
    "        for i in range(party_nums):\n",
    "            input_layers.append(keras.Input(input_dim,))\n",
    "        \n",
    "        merged_layer = layers.concatenate(input_layers)\n",
    "        fuse_layer = layers.Dense(64, activation='relu')(merged_layer)\n",
    "        output = layers.Dense(output_dim, activation='sigmoid')(fuse_layer)\n",
    "\n",
    "        model = keras.Model(inputs=input_layers, outputs=output)\n",
    "        model.summary()\n",
    "        \n",
    "        model.compile(loss='binary_crossentropy',\n",
    "                      optimizer='adam',\n",
    "                      metrics=[\"accuracy\",tf.keras.metrics.AUC()])\n",
    "        return model\n",
    "    return create_model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [],
   "source": [
    "model_fuse = create_fuse_model(\n",
    "    input_dim=hidden_size, party_nums=2, output_dim=1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Model: \"model\"\n",
      "__________________________________________________________________________________________________\n",
      " Layer (type)                   Output Shape         Param #     Connected to                     \n",
      "==================================================================================================\n",
      " input_3 (InputLayer)           [(None, 64)]         0           []                               \n",
      "                                                                                                  \n",
      " input_4 (InputLayer)           [(None, 64)]         0           []                               \n",
      "                                                                                                  \n",
      " concatenate (Concatenate)      (None, 128)          0           ['input_3[0][0]',                \n",
      "                                                                  'input_4[0][0]']                \n",
      "                                                                                                  \n",
      " dense_4 (Dense)                (None, 64)           8256        ['concatenate[0][0]']            \n",
      "                                                                                                  \n",
      " dense_5 (Dense)                (None, 1)            65          ['dense_4[0][0]']                \n",
      "                                                                                                  \n",
      "==================================================================================================\n",
      "Total params: 8,321\n",
      "Trainable params: 8,321\n",
      "Non-trainable params: 0\n",
      "__________________________________________________________________________________________________\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "<keras.engine.functional.Functional at 0x7f4a84633460>"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "model_fuse()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Create Split Learning Model\n",
    "Secretflow provides the split learning model `SLModel`\n",
    "To initial SLModel only need 3 parameters\n",
    "* base_model_dict: A dictionary needs to be passed in all clients participating in the training along with base_model mappings\n",
    "* device_y: PYU, which device has label\n",
    "* model_fuse: The fusion model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Define base_model_dict  \n",
    "```python\n",
    "base_model_dict:Dict[PYU,model_fn]\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [],
   "source": [
    "base_model_dict = {\n",
    "    alice: model_base_alice,\n",
    "    bob:   model_base_bob\n",
    "}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [],
   "source": [
    "from secretflow.security.privacy import DPStrategy, GaussianEmbeddingDP, LabelDP\n",
    "\n",
    "# Define DP operations\n",
    "train_batch_size = 128\n",
    "gaussian_embedding_dp = GaussianEmbeddingDP(\n",
    "    noise_multiplier=0.5,\n",
    "    l2_norm_clip=1.0,\n",
    "    batch_size=train_batch_size,\n",
    "    num_samples=train_data.values.partition_shape()[alice][0],\n",
    "    is_secure_generator=False,\n",
    ")\n",
    "dp_strategy_alice = DPStrategy(embedding_dp=gaussian_embedding_dp)\n",
    "label_dp = LabelDP(eps=64.0)\n",
    "dp_strategy_bob = DPStrategy(label_dp=label_dp)\n",
    "dp_strategy_dict = {alice: dp_strategy_alice, bob: dp_strategy_bob}\n",
    "dp_spent_step_freq = 10"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [],
   "source": [
    "sl_model = SLModel(\n",
    "    base_model_dict=base_model_dict, \n",
    "    device_y=alice,  \n",
    "    model_fuse=model_fuse,\n",
    "    dp_strategy_dict=dp_strategy_dict,)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(           age       job  marital  education\n",
       " 1426  0.279412  0.181818      0.5   0.333333\n",
       " 416   0.176471  0.636364      1.0   0.333333\n",
       " 3977  0.264706  0.000000      0.5   0.666667\n",
       " 2291  0.338235  0.000000      0.5   0.333333\n",
       " 257   0.132353  0.909091      1.0   0.333333\n",
       " ...        ...       ...      ...        ...\n",
       " 1508  0.264706  0.818182      1.0   0.333333\n",
       " 979   0.544118  0.090909      0.0   0.000000\n",
       " 3494  0.455882  0.090909      0.5   0.000000\n",
       " 42    0.485294  0.090909      0.5   0.333333\n",
       " 1386  0.455882  0.636364      0.5   0.333333\n",
       " \n",
       " [905 rows x 4 columns],\n",
       "       y\n",
       " 1426  0\n",
       " 416   0\n",
       " 3977  0\n",
       " 2291  0\n",
       " 257   0\n",
       " ...  ..\n",
       " 1508  0\n",
       " 979   0\n",
       " 3494  0\n",
       " 42    0\n",
       " 1386  0\n",
       " \n",
       " [905 rows x 1 columns])"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\u001b[2m\u001b[36m(pid=128496)\u001b[0m 2022-09-07 15:09:30.126909: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/rh/devtoolset-10/root/usr/lib64:/opt/rh/devtoolset-10/root/usr/lib:/opt/rh/devtoolset-10/root/usr/lib64/dyninst:/opt/rh/devtoolset-10/root/usr/lib/dyninst:/opt/rh/devtoolset-10/root/usr/lib64:/opt/rh/devtoolset-10/root/usr/lib\n",
      "\u001b[2m\u001b[36m(pid=128494)\u001b[0m 2022-09-07 15:09:30.186352: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/rh/devtoolset-10/root/usr/lib64:/opt/rh/devtoolset-10/root/usr/lib:/opt/rh/devtoolset-10/root/usr/lib64/dyninst:/opt/rh/devtoolset-10/root/usr/lib/dyninst:/opt/rh/devtoolset-10/root/usr/lib64:/opt/rh/devtoolset-10/root/usr/lib\n"
     ]
    }
   ],
   "source": [
    "sf.reveal(test_data.partitions[alice].data), sf.reveal(test_label.partitions[alice].data)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(           age       job  marital  education\n",
       " 1106  0.235294  0.090909      0.5   0.333333\n",
       " 1309  0.176471  0.363636      0.5   0.333333\n",
       " 2140  0.411765  0.272727      1.0   0.666667\n",
       " 2134  0.573529  0.454545      0.5   0.333333\n",
       " 960   0.485294  0.818182      0.5   0.333333\n",
       " ...        ...       ...      ...        ...\n",
       " 664   0.397059  0.090909      1.0   0.333333\n",
       " 3276  0.235294  0.181818      0.5   0.666667\n",
       " 1318  0.220588  0.818182      0.5   0.333333\n",
       " 723   0.220588  0.636364      0.5   0.333333\n",
       " 2863  0.176471  0.363636      1.0   0.666667\n",
       " \n",
       " [3616 rows x 4 columns],\n",
       "       y\n",
       " 1106  0\n",
       " 1309  0\n",
       " 2140  1\n",
       " 2134  0\n",
       " 960   0\n",
       " ...  ..\n",
       " 664   0\n",
       " 3276  0\n",
       " 1318  0\n",
       " 723   0\n",
       " 2863  0\n",
       " \n",
       " [3616 rows x 1 columns])"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128496)\u001b[0m 2022-09-07 15:09:32.495262: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/rh/devtoolset-10/root/usr/lib64:/opt/rh/devtoolset-10/root/usr/lib:/opt/rh/devtoolset-10/root/usr/lib64/dyninst:/opt/rh/devtoolset-10/root/usr/lib/dyninst:/opt/rh/devtoolset-10/root/usr/lib64:/opt/rh/devtoolset-10/root/usr/lib\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128496)\u001b[0m 2022-09-07 15:09:32.495291: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128494)\u001b[0m 2022-09-07 15:09:32.673837: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/rh/devtoolset-10/root/usr/lib64:/opt/rh/devtoolset-10/root/usr/lib:/opt/rh/devtoolset-10/root/usr/lib64/dyninst:/opt/rh/devtoolset-10/root/usr/lib/dyninst:/opt/rh/devtoolset-10/root/usr/lib64:/opt/rh/devtoolset-10/root/usr/lib\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128494)\u001b[0m 2022-09-07 15:09:32.673871: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128496)\u001b[0m Model: \"sequential\"\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128496)\u001b[0m _________________________________________________________________\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128496)\u001b[0m  Layer (type)                Output Shape              Param #   \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128496)\u001b[0m =================================================================\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128496)\u001b[0m  dense (Dense)               (None, 100)               1300      \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128496)\u001b[0m                                                                  \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128496)\u001b[0m  dense_1 (Dense)             (None, 64)                6464      \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128496)\u001b[0m                                                                  \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128496)\u001b[0m =================================================================\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128496)\u001b[0m Total params: 7,764\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128496)\u001b[0m Trainable params: 7,764\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128496)\u001b[0m Non-trainable params: 0\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128496)\u001b[0m _________________________________________________________________\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128494)\u001b[0m Model: \"sequential\"\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128494)\u001b[0m _________________________________________________________________\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128494)\u001b[0m  Layer (type)                Output Shape              Param #   \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128494)\u001b[0m =================================================================\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128494)\u001b[0m  dense (Dense)               (None, 100)               500       \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128494)\u001b[0m                                                                  \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128494)\u001b[0m  dense_1 (Dense)             (None, 64)                6464      \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128494)\u001b[0m                                                                  \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128494)\u001b[0m =================================================================\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128494)\u001b[0m Total params: 6,964\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128494)\u001b[0m Trainable params: 6,964\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128494)\u001b[0m Non-trainable params: 0\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128494)\u001b[0m _________________________________________________________________\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128494)\u001b[0m Model: \"model\"\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128494)\u001b[0m __________________________________________________________________________________________________\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128494)\u001b[0m  Layer (type)                   Output Shape         Param #     Connected to                     \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128494)\u001b[0m ==================================================================================================\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128494)\u001b[0m  input_2 (InputLayer)           [(None, 64)]         0           []                               \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128494)\u001b[0m                                                                                                   \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128494)\u001b[0m  input_3 (InputLayer)           [(None, 64)]         0           []                               \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128494)\u001b[0m                                                                                                   \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128494)\u001b[0m  concatenate (Concatenate)      (None, 128)          0           ['input_2[0][0]',                \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128494)\u001b[0m                                                                   'input_3[0][0]']                \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128494)\u001b[0m                                                                                                   \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128494)\u001b[0m  dense_2 (Dense)                (None, 64)           8256        ['concatenate[0][0]']            \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128494)\u001b[0m                                                                                                   \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128494)\u001b[0m  dense_3 (Dense)                (None, 1)            65          ['dense_2[0][0]']                \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128494)\u001b[0m                                                                                                   \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128494)\u001b[0m ==================================================================================================\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128494)\u001b[0m Total params: 8,321\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128494)\u001b[0m Trainable params: 8,321\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128494)\u001b[0m Non-trainable params: 0\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=128494)\u001b[0m __________________________________________________________________________________________________\n"
     ]
    }
   ],
   "source": [
    "sf.reveal(train_data.partitions[alice].data), sf.reveal(train_label.partitions[alice].data)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\u001b[2m\u001b[36m(_run pid=3681860)\u001b[0m 2022-07-13 08:27:48.176592: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/rh/gcc-toolset-11/root/usr/lib64:/opt/rh/gcc-toolset-11/root/usr/lib:/opt/rh/gcc-toolset-11/root/usr/lib64/dyninst:/opt/rh/gcc-toolset-11/root/usr/lib/dyninst\n",
      "\u001b[2m\u001b[36m(_run pid=3681862)\u001b[0m 2022-07-13 08:27:48.189799: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/rh/gcc-toolset-11/root/usr/lib64:/opt/rh/gcc-toolset-11/root/usr/lib:/opt/rh/gcc-toolset-11/root/usr/lib64/dyninst:/opt/rh/gcc-toolset-11/root/usr/lib/dyninst\n",
      " 76%|███████▌  | 22/29 [00:00<00:00, 49.58it/s]\u001b[2m\u001b[36m(PYUSLTFModel pid=3681860)\u001b[0m 2022-07-13 08:27:49.935229: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/rh/gcc-toolset-11/root/usr/lib64:/opt/rh/gcc-toolset-11/root/usr/lib:/opt/rh/gcc-toolset-11/root/usr/lib64/dyninst:/opt/rh/gcc-toolset-11/root/usr/lib/dyninst\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681860)\u001b[0m 2022-07-13 08:27:49.935259: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681860)\u001b[0m Model: \"sequential\"\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681860)\u001b[0m _________________________________________________________________\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681860)\u001b[0m  Layer (type)                Output Shape              Param #   \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681860)\u001b[0m =================================================================\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681860)\u001b[0m  dense (Dense)               (None, 100)               1300      \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681860)\u001b[0m                                                                  \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681860)\u001b[0m  dense_1 (Dense)             (None, 64)                6464      \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681860)\u001b[0m                                                                  \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681860)\u001b[0m =================================================================\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681860)\u001b[0m Total params: 7,764\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681860)\u001b[0m Trainable params: 7,764\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681860)\u001b[0m Non-trainable params: 0\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681860)\u001b[0m _________________________________________________________________\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681860)\u001b[0m Model: \"sequential_1\"\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681860)\u001b[0m _________________________________________________________________\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681860)\u001b[0m  Layer (type)                Output Shape              Param #   \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681860)\u001b[0m =================================================================\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681860)\u001b[0m  dense_2 (Dense)             (None, 100)               1300      \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681860)\u001b[0m                                                                  \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681860)\u001b[0m  dense_3 (Dense)             (None, 64)                6464      \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681860)\u001b[0m                                                                  \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681860)\u001b[0m =================================================================\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681860)\u001b[0m Total params: 7,764\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681860)\u001b[0m Trainable params: 7,764\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681860)\u001b[0m Non-trainable params: 0\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681860)\u001b[0m _________________________________________________________________\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m 2022-07-13 08:27:50.103144: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/rh/gcc-toolset-11/root/usr/lib64:/opt/rh/gcc-toolset-11/root/usr/lib:/opt/rh/gcc-toolset-11/root/usr/lib64/dyninst:/opt/rh/gcc-toolset-11/root/usr/lib/dyninst\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m 2022-07-13 08:27:50.103181: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m Model: \"sequential\"\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m _________________________________________________________________\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m  Layer (type)                Output Shape              Param #   \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m =================================================================\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m  dense (Dense)               (None, 100)               500       \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m                                                                  \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m  dense_1 (Dense)             (None, 64)                6464      \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m                                                                  \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m =================================================================\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m Total params: 6,964\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m Trainable params: 6,964\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m Non-trainable params: 0\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m _________________________________________________________________\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m Model: \"sequential_1\"\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m _________________________________________________________________\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m  Layer (type)                Output Shape              Param #   \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m =================================================================\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m  dense_2 (Dense)             (None, 100)               500       \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m                                                                  \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m  dense_3 (Dense)             (None, 64)                6464      \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m                                                                  \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m =================================================================\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m Total params: 6,964\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m Trainable params: 6,964\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m Non-trainable params: 0\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m _________________________________________________________________\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m Model: \"model\"\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m __________________________________________________________________________________________________\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m  Layer (type)                   Output Shape         Param #     Connected to                     \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m ==================================================================================================\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m  input_3 (InputLayer)           [(None, 64)]         0           []                               \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m                                                                                                   \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m  input_4 (InputLayer)           [(None, 64)]         0           []                               \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m                                                                                                   \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m  concatenate (Concatenate)      (None, 128)          0           ['input_3[0][0]',                \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m                                                                   'input_4[0][0]']                \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m                                                                                                   \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m  dense_4 (Dense)                (None, 64)           8256        ['concatenate[0][0]']            \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m                                                                                                   \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m  dense_5 (Dense)                (None, 1)            65          ['dense_4[0][0]']                \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m                                                                                                   \n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m ==================================================================================================\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m Total params: 8,321\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m Trainable params: 8,321\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m Non-trainable params: 0\n",
      "\u001b[2m\u001b[36m(PYUSLTFModel pid=3681862)\u001b[0m __________________________________________________________________________________________________\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|██████████| 29/29 [00:03<00:00,  8.82it/s, epoch: 0/10 -  train_loss:0.42108118534088135  train_accuracy:0.8505681753158569  train_auc_2:0.546562671661377  val_loss:0.3594009280204773  val_accuracy:0.8877212405204773  val_auc_2:0.6065167188644409 ]\n",
      "100%|██████████| 29/29 [00:01<00:00, 20.75it/s, epoch: 1/10 -  train_loss:0.3594178855419159  train_accuracy:0.8831877708435059  train_auc_2:0.6193124055862427  val_loss:0.3354310989379883  val_accuracy:0.8877212405204773  val_auc_2:0.6627764701843262 ]\n",
      "100%|██████████| 29/29 [00:01<00:00, 20.43it/s, epoch: 2/10 -  train_loss:0.33101484179496765  train_accuracy:0.8888274431228638  train_auc_2:0.6729121208190918  val_loss:0.3168722987174988  val_accuracy:0.8877212405204773  val_auc_2:0.7386461496353149 ]\n",
      "100%|██████████| 29/29 [00:01<00:00, 22.91it/s, epoch: 3/10 -  train_loss:0.30363929271698  train_accuracy:0.8918694853782654  train_auc_2:0.7538092732429504  val_loss:0.29769349098205566  val_accuracy:0.8877212405204773  val_auc_2:0.792555570602417 ]\n",
      "100%|██████████| 29/29 [00:01<00:00, 23.11it/s, epoch: 4/10 -  train_loss:0.2974799871444702  train_accuracy:0.88606196641922  train_auc_2:0.7942224740982056  val_loss:0.27211683988571167  val_accuracy:0.8874446749687195  val_auc_2:0.8455085754394531 ]\n",
      "100%|██████████| 29/29 [00:01<00:00, 23.23it/s, epoch: 5/10 -  train_loss:0.2701774835586548  train_accuracy:0.8856502175331116  train_auc_2:0.8522427082061768  val_loss:0.2566554844379425  val_accuracy:0.8918694853782654  val_auc_2:0.862573504447937 ]\n",
      "100%|██████████| 29/29 [00:01<00:00, 23.72it/s, epoch: 6/10 -  train_loss:0.25473421812057495  train_accuracy:0.8919214010238647  train_auc_2:0.8635954856872559  val_loss:0.24478046596050262  val_accuracy:0.8979535102844238  val_auc_2:0.8796356916427612 ]\n",
      "100%|██████████| 29/29 [00:01<00:00, 23.65it/s, epoch: 7/10 -  train_loss:0.23962831497192383  train_accuracy:0.8991539478302002  train_auc_2:0.8786008954048157  val_loss:0.24174632132053375  val_accuracy:0.8960176706314087  val_auc_2:0.8820918202400208 ]\n",
      "100%|██████████| 29/29 [00:01<00:00, 24.32it/s, epoch: 8/10 -  train_loss:0.24154597520828247  train_accuracy:0.8996636867523193  train_auc_2:0.880363941192627  val_loss:0.28965064883232117  val_accuracy:0.8938053250312805  val_auc_2:0.864720344543457 ]\n",
      "100%|██████████| 29/29 [00:01<00:00, 22.25it/s, epoch: 9/10 -  train_loss:0.2731249928474426  train_accuracy:0.8942201137542725  train_auc_2:0.8479944467544556  val_loss:0.24363426864147186  val_accuracy:0.8982300758361816  val_auc_2:0.8842920660972595 ]\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "{'train_loss': [0.4210812,\n",
       "  0.3594179,\n",
       "  0.33101484,\n",
       "  0.3036393,\n",
       "  0.29748,\n",
       "  0.27017748,\n",
       "  0.25473422,\n",
       "  0.23962831,\n",
       "  0.24154598,\n",
       "  0.273125],\n",
       " 'train_accuracy': [0.8505682,\n",
       "  0.8831878,\n",
       "  0.88882744,\n",
       "  0.8918695,\n",
       "  0.88606197,\n",
       "  0.8856502,\n",
       "  0.8919214,\n",
       "  0.89915395,\n",
       "  0.8996637,\n",
       "  0.8942201],\n",
       " 'train_auc_2': [0.5465627,\n",
       "  0.6193124,\n",
       "  0.6729121,\n",
       "  0.7538093,\n",
       "  0.7942225,\n",
       "  0.8522427,\n",
       "  0.8635955,\n",
       "  0.8786009,\n",
       "  0.88036394,\n",
       "  0.84799445],\n",
       " 'val_loss': [0.35940093,\n",
       "  0.3354311,\n",
       "  0.3168723,\n",
       "  0.2976935,\n",
       "  0.27211684,\n",
       "  0.25665548,\n",
       "  0.24478047,\n",
       "  0.24174632,\n",
       "  0.28965065,\n",
       "  0.24363427],\n",
       " 'val_accuracy': [0.88772124,\n",
       "  0.88772124,\n",
       "  0.88772124,\n",
       "  0.88772124,\n",
       "  0.8874447,\n",
       "  0.8918695,\n",
       "  0.8979535,\n",
       "  0.8960177,\n",
       "  0.8938053,\n",
       "  0.8982301],\n",
       " 'val_auc_2': [0.6065167,\n",
       "  0.66277647,\n",
       "  0.73864615,\n",
       "  0.7925556,\n",
       "  0.8455086,\n",
       "  0.8625735,\n",
       "  0.8796357,\n",
       "  0.8820918,\n",
       "  0.86472034,\n",
       "  0.88429207]}"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sl_model.fit(train_data,\n",
    "             train_label,\n",
    "             validation_data=(test_data,test_label),\n",
    "             epochs=10, \n",
    "             batch_size=train_batch_size,\n",
    "             shuffle=True,\n",
    "             verbose=1,\n",
    "             validation_freq=1,\n",
    "             dp_spent_step_freq=dp_spent_step_freq,)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's call the evaluation function"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Evaluate Processing:: 100%|██████████| 29/29 [00:00<00:00, 72.20it/s, loss:0.24379125237464905 accuracy:0.8979535102844238 auc_2:0.8850194811820984]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'loss': 0.24379125, 'accuracy': 0.8979535, 'auc_2': 0.8850195}\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    }
   ],
   "source": [
    "global_metric = sl_model.evaluate(test_data, test_label, batch_size=128)\n",
    "print(global_metric)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Contrast to local model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Model\n",
    "The model structure is consistent with the model of split learning above, but only the model structure of Alice is used here. The model definition refers to the code below.\n",
    "#### Data\n",
    "The data also use kaggle's anti-fraud data. Here, we just use Alice's data of the new bank."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [],
   "source": [
    "from tensorflow import keras\n",
    "from tensorflow.keras import layers\n",
    "import tensorflow as tf\n",
    "from sklearn.model_selection import train_test_split\n",
    "\n",
    "def create_model():\n",
    "\n",
    "    model = keras.Sequential(\n",
    "        [\n",
    "            keras.Input(shape=4),\n",
    "            layers.Dense(100,activation =\"relu\" ),\n",
    "            layers.Dense(64, activation='relu'),\n",
    "            layers.Dense(64, activation='relu'),\n",
    "            layers.Dense(1, activation='sigmoid')\n",
    "        ]\n",
    "    )\n",
    "    model.compile(loss='binary_crossentropy',\n",
    "                      optimizer='adam',\n",
    "                      metrics=[\"accuracy\",tf.keras.metrics.AUC()])\n",
    "    return model\n",
    "\n",
    "single_model = create_model()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "data process"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/tmp/ipykernel_3681029/3166940482.py:7: SettingWithCopyWarning: \n",
      "A value is trying to be set on a copy of a slice from a DataFrame.\n",
      "Try using .loc[row_indexer,col_indexer] = value instead\n",
      "\n",
      "See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
      "  alice_data.loc[:, 'job'] = encoder.fit_transform(alice_data['job'])\n",
      "/tmp/ipykernel_3681029/3166940482.py:8: SettingWithCopyWarning: \n",
      "A value is trying to be set on a copy of a slice from a DataFrame.\n",
      "Try using .loc[row_indexer,col_indexer] = value instead\n",
      "\n",
      "See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
      "  alice_data.loc[:, 'marital'] = encoder.fit_transform(alice_data['marital'])\n",
      "/tmp/ipykernel_3681029/3166940482.py:9: SettingWithCopyWarning: \n",
      "A value is trying to be set on a copy of a slice from a DataFrame.\n",
      "Try using .loc[row_indexer,col_indexer] = value instead\n",
      "\n",
      "See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
      "  alice_data.loc[:, 'education'] = encoder.fit_transform(alice_data['education'])\n",
      "/tmp/ipykernel_3681029/3166940482.py:10: SettingWithCopyWarning: \n",
      "A value is trying to be set on a copy of a slice from a DataFrame.\n",
      "Try using .loc[row_indexer,col_indexer] = value instead\n",
      "\n",
      "See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
      "  alice_data.loc[:, 'y'] =  encoder.fit_transform(alice_data['y'])\n"
     ]
    }
   ],
   "source": [
    "import pandas as pd\n",
    "from sklearn.model_selection import train_test_split\n",
    "from sklearn.preprocessing import MinMaxScaler\n",
    "from sklearn.preprocessing import LabelEncoder\n",
    "\n",
    "encoder = LabelEncoder()\n",
    "alice_data.loc[:, 'job'] = encoder.fit_transform(alice_data['job'])\n",
    "alice_data.loc[:, 'marital'] = encoder.fit_transform(alice_data['marital'])\n",
    "alice_data.loc[:, 'education'] = encoder.fit_transform(alice_data['education'])\n",
    "alice_data.loc[:, 'y'] =  encoder.fit_transform(alice_data['y'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [],
   "source": [
    "y = alice_data['y']\n",
    "alice_data = alice_data.drop(columns=['y'],inplace=False)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [],
   "source": [
    "scaler = MinMaxScaler()\n",
    "alice_data = scaler.fit_transform(alice_data)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [],
   "source": [
    "train_data,test_data = train_test_split(alice_data,train_size=0.8,random_state=random_state)\n",
    "train_label,test_label = train_test_split(y,train_size=0.8,random_state=random_state)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(905, 4)"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "test_data.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch 1/10\n",
      "29/29 [==============================] - 1s 13ms/step - loss: 0.5276 - accuracy: 0.8603 - auc_3: 0.4496 - val_loss: 0.4029 - val_accuracy: 0.8729 - val_auc_3: 0.4166\n",
      "Epoch 2/10\n",
      "29/29 [==============================] - 0s 4ms/step - loss: 0.3715 - accuracy: 0.8877 - auc_3: 0.4557 - val_loss: 0.3980 - val_accuracy: 0.8729 - val_auc_3: 0.4094\n",
      "Epoch 3/10\n",
      "29/29 [==============================] - 0s 4ms/step - loss: 0.3652 - accuracy: 0.8877 - auc_3: 0.4300 - val_loss: 0.3924 - val_accuracy: 0.8729 - val_auc_3: 0.4060\n",
      "Epoch 4/10\n",
      "29/29 [==============================] - 0s 4ms/step - loss: 0.3598 - accuracy: 0.8877 - auc_3: 0.4403 - val_loss: 0.3892 - val_accuracy: 0.8729 - val_auc_3: 0.4206\n",
      "Epoch 5/10\n",
      "29/29 [==============================] - 0s 4ms/step - loss: 0.3584 - accuracy: 0.8877 - auc_3: 0.4546 - val_loss: 0.3871 - val_accuracy: 0.8729 - val_auc_3: 0.4446\n",
      "Epoch 6/10\n",
      "29/29 [==============================] - 0s 4ms/step - loss: 0.3572 - accuracy: 0.8877 - auc_3: 0.4705 - val_loss: 0.3853 - val_accuracy: 0.8729 - val_auc_3: 0.4678\n",
      "Epoch 7/10\n",
      "29/29 [==============================] - 0s 4ms/step - loss: 0.3559 - accuracy: 0.8877 - auc_3: 0.4822 - val_loss: 0.3837 - val_accuracy: 0.8729 - val_auc_3: 0.4922\n",
      "Epoch 8/10\n",
      "29/29 [==============================] - 0s 4ms/step - loss: 0.3548 - accuracy: 0.8877 - auc_3: 0.4977 - val_loss: 0.3823 - val_accuracy: 0.8729 - val_auc_3: 0.5142\n",
      "Epoch 9/10\n",
      "29/29 [==============================] - 0s 4ms/step - loss: 0.3538 - accuracy: 0.8877 - auc_3: 0.5101 - val_loss: 0.3811 - val_accuracy: 0.8729 - val_auc_3: 0.5290\n",
      "Epoch 10/10\n",
      "29/29 [==============================] - 0s 4ms/step - loss: 0.3529 - accuracy: 0.8877 - auc_3: 0.5220 - val_loss: 0.3800 - val_accuracy: 0.8729 - val_auc_3: 0.5447\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "<keras.callbacks.History at 0x7f1a74578610>"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "single_model.fit(train_data,train_label,validation_data=(test_data,test_label),batch_size=128,epochs=10,shuffle=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Summary\n",
    "The above two experiments simulate a typical vertical scene training problem. Alice and Bob have the same sample group, but each side has only a part of the features. If Alice only uses her own data to train the model, an accuracy of **0.872**, AUC **0.53** model can be obtained. However, if Bob's data are combined, a model with an accuracy of **0.893**  and AUC **0.883** can be obtained."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Conclusion"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* This tutorial introduces what is split learning and how to do it in secretFlow  \n",
    "* It can be seen from the experimental data that split learning has significant advantages in expanding sample dimension and improving model effect through joint multi-party training\n",
    "* This tutorial uses plaintext aggregation to demonstrate, without considering the leakage problem of hidden layer. Secretflow provides AggLayer to avoid the leakage problem of hidden layer plaintext transmission through MPC,TEE,HE, and DP. If you are interested, please refer to relevant documents.\n",
    "* Next, you may want to try different data sets, you need to vertically shard the data first and then follow the flow of this tutorial\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3.8.13 ('py3.8')",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.13"
  },
  "vscode": {
   "interpreter": {
    "hash": "66d1547304beaba725027c44e57cc46fc747862fe9496520910412a3187eb35f"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
